
313 research outputs found

The Tree Inclusion Problem: In Linear Space and Faster

Author: Alstrup S.
Bender M. A.
Cole R.
Demaine E. D.
Ferragina P.
Inge Li Gørtz
Muthukrishnan S.
Philip Bille
Schlieder T.
Termier A.
Yang L. H.
Zezula P.
Publication date: 01/01/2011

Given two rooted, ordered, and labeled trees $P$ and $T$, the tree inclusion problem is to determine if $P$ can be obtained from $T$ by deleting nodes in $T$. This problem has recently been recognized as an important query primitive in XML databases. Kilpeläinen and Mannila [SIAM J. Comput. 1995] presented the first polynomial time algorithm using quadratic time and space. Since then several improved results have been obtained for special cases when $P$ and $T$ have a small number of leaves or small depth. However, in the worst case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and $d_S$ denote the number of nodes, the number of leaves, and the maximum depth of a tree $S \in \{P, T\}$. In this paper we show that the tree inclusion problem can be solved in space $O(n_T)$ and time $O(\min(l_P n_T, l_P l_T \log \log n_T + n_T, \frac{n_P n_T}{\log n_T} + n_T \log n_T))$. This improves or matches the best known time complexities while using only linear space instead of quadratic. This is particularly important in practical applications, such as XML databases, where the space is likely to be a bottleneck.

Comment: Minor updates from last time

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

Hardness of Exact Distance Queries in Sparse Graphs Through Hub Labeling

Author: Alstrup S.
Köhler E.
P.
Ruzsa I. Z.
Twigg A. D.
Publication date: 01/01/2019

A distance labeling scheme is an assignment of bit-labels to the vertices of an undirected, unweighted graph such that the distance between any pair of vertices can be decoded solely from their labels. An important class of distance labeling schemes is that of hub labelings, where a node $v \in G$ stores its distance to the so-called hubs $S_v \subseteq V$, chosen so that for any $u,v \in V$ there is $w \in S_u \cap S_v$ belonging to some shortest $uv$ path. Notice that for most existing graph classes, the best known distance labeling constructions use, at some point, a hub labeling scheme at least as a key building block. Our interest lies in hub labelings of sparse graphs, i.e., those with $|E(G)| = O(n)$, for which we show a lower bound of $\frac{n}{2^{O(\sqrt{\log n})}}$ for the average size of the hubsets. Additionally, we show a hub-labeling construction for sparse graphs of average size $O(\frac{n}{RS(n)^{c}})$ for some $0 < c < 1$, where $RS(n)$ is the so-called Ruzsa-Szemerédi function, linked to the structure of induced matchings in dense graphs. This implies that further improving the lower bound on hub labeling size to $\frac{n}{2^{(\log n)^{o(1)}}}$ would require a breakthrough in the study of lower bounds on $RS(n)$, which have resisted substantial improvement in the last 70 years. For general distance labeling of sparse graphs, we show a lower bound of $\frac{1}{2^{O(\sqrt{\log n})}} SumIndex(n)$, where $SumIndex(n)$ is the communication complexity of the Sum-Index problem over $Z_n$. Our results suggest that the best achievable hub-label size and distance-label size in sparse graphs may be $\Theta(\frac{n}{2^{(\log n)^c}})$ for some $0 < c < 1$.
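The decoding rule for hub labelings described in this abstract can be illustrated with a small sketch. It is not the construction from the paper: the function names, the path graph, and the hand-picked hub sets (each vertex plus the midpoints of the nested sub-paths containing it) are all assumptions chosen for illustration. The distance between two vertices is recovered as the minimum of $d(u,w) + d(w,v)$ over the common hubs $w$.

```python
from collections import deque


def bfs_distances(adj, source):
    """Unweighted single-source distances via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_hub_labels(adj, hubs):
    """The label of v maps each hub w in S_v to the distance d(v, w)."""
    labels = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        labels[v] = {w: dist[w] for w in hubs[v]}
    return labels


def decode_distance(label_u, label_v):
    """Decode d(u, v) from the two labels alone, using only common hubs."""
    common = label_u.keys() & label_v.keys()
    return min(label_u[w] + label_v[w] for w in common)


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
# Hand-picked hub sets (an assumption): every pair of vertices shares
# a hub that lies on the shortest path between them.
hubs = {0: {0, 1, 2}, 1: {1, 2}, 2: {2}, 3: {3, 2}, 4: {4, 3, 2}}
labels = build_hub_labels(adj, hubs)

print(decode_distance(labels[0], labels[4]))
```

The quantity the abstract bounds is the average size of the sets `hubs[v]`; in this toy example it is 2.2 hubs per vertex.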

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition

Author: Alstrup S.
Aluru S.
Bern M. W.
David Eppstein
Erickson J.
Saitou N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/02/2008

We provide efficient constant factor approximation algorithms for the problems of finding a hierarchical clustering of a point set in any metric space, minimizing the sum of minimum spanning tree lengths within each cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can also be used to provide a pants decomposition, that is, a set of disjoint simple closed curves partitioning the plane minus the input points into subsets with exactly three boundary components, with approximately minimum total length. In the Euclidean case, these curves are squares; in the hyperbolic case, they combine our Euclidean square pants decomposition with our tree clustering method for general metric spaces.

Comment: 22 pages, 14 figures. This version replaces the proof of what is now Lemma 5.2, as the previous proof was erroneous

arXiv.org e-Print Archive

Crossref

A simple and optimal ancestry labeling scheme for trees

Author: A Deutsch
C Gavoille
E Cohen
P Fraigniaud
Robert Endre Tarjan
S Abiteboul
S Alstrup
Publication date: 01/01/2015

We present a $\lg n + 2 \lg \lg n + 3$ ancestry labeling scheme for trees. The problem was first presented by Kannan et al. [STOC '88] along with a simple $2 \lg n$ solution. Motivated by applications to XML files, the label size was improved incrementally over the course of more than 20 years by a series of papers. The last, due to Fraigniaud and Korman [STOC '10], presented an asymptotically optimal $\lg n + 4 \lg \lg n + O(1)$ labeling scheme using non-trivial tree-decomposition techniques. By providing a framework generalizing interval based labeling schemes, we obtain a simple, yet asymptotically optimal solution to the problem. Furthermore, our labeling scheme is attained by a small modification of the original $2 \lg n$ solution.

Comment: 12 pages, 1 figure. To appear at ICALP'1
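As a rough illustration of the simple $2 \lg n$ baseline mentioned in this abstract, the sketch below assumes the classic interval scheme: each node stores its preorder number and the largest preorder number in its subtree, and ancestry is decided by interval containment. The function names and the example tree are hypothetical, and this is not the improved scheme of the paper.

```python
def interval_labels(children, root):
    """Label each node with (pre, last): its preorder number and the
    largest preorder number inside its subtree. Both values are below n,
    so a label fits in about 2 * lg(n) bits."""
    labels = {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants are numbered; close the interval.
            labels[node] = (labels[node][0], counter - 1)
            continue
        labels[node] = (counter, None)
        counter += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(label_u, label_v):
    """True if u is an ancestor of v, decided from the two labels only.
    A node counts as its own ancestor here (an assumption)."""
    return label_u[0] <= label_v[0] <= label_u[1]


# Hypothetical tree: a has children b and c; b has child d.
children = {"a": ["b", "c"], "b": ["d"]}
labels = interval_labels(children, "a")

print(is_ancestor(labels["a"], labels["d"]))
print(is_ancestor(labels["b"], labels["c"]))
```

The iterative traversal avoids recursion limits on deep trees; a recursive DFS would produce the same labels.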

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

2-Vertex Connectivity in Directed Graphs

Author: AL Buchsbaum
GF Italiano
H Nagamochi
K Menger
RE Tarjan
S Alstrup
Publication date: 19/02/2015

We complement our study of 2-connectivity in directed graphs by considering the computation of the following 2-vertex-connectivity relations: We say that two vertices v and w are 2-vertex-connected if there are two internally vertex-disjoint paths from v to w and two internally vertex-disjoint paths from w to v. We also say that v and w are vertex-resilient if the removal of any vertex different from v and w leaves v and w in the same strongly connected component. We show how to compute the above relations in linear time so that we can report in constant time if two vertices are 2-vertex-connected or if they are vertex-resilient. We also show how to compute in linear time a sparse certificate for these relations, i.e., a subgraph of the input graph that has O(n) edges and maintains the same 2-vertex-connectivity and vertex-resilience relations as the input graph, where n is the number of vertices.

Comment: arXiv admin note: substantial text overlap with arXiv:1407.304
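The vertex-resilience relation defined in this abstract can be checked directly from its definition. The sketch below is a brute-force check that removes each other vertex in turn and tests reachability in both directions; it runs in roughly O(n(n + m)) time per pair, so it is not the linear-time algorithm of the paper. The function names and example graphs are hypothetical.

```python
from collections import deque


def reaches(adj, src, dst, removed=None):
    """True if dst is reachable from src in the digraph with `removed` deleted."""
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for v in adj.get(u, ()):
            if v != removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def vertex_resilient(adj, v, w):
    """Brute-force test of the definition: after deleting any single vertex
    other than v and w, v and w must still reach each other."""
    # Assumption for the degenerate case: also require mutual reachability
    # when nothing is removed.
    if not (reaches(adj, v, w) and reaches(adj, w, v)):
        return False
    for x in adj:
        if x == v or x == w:
            continue
        if not (reaches(adj, v, w, x) and reaches(adj, w, v, x)):
            return False
    return True


# Hypothetical examples on vertices 0..3 (every vertex is a key of adj).
one_way_cycle = {0: [1], 1: [2], 2: [3], 3: [0]}
two_way_cycle = {0: [1, 3], 1: [2, 0], 2: [3, 1], 3: [0, 2]}

print(vertex_resilient(one_way_cycle, 0, 2))
print(vertex_resilient(two_way_cycle, 0, 2))
```

In the one-way cycle, deleting vertex 1 cuts the only path from 0 to 2, so the pair is not vertex-resilient; the two-way cycle always leaves a detour.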

arXiv.org e-Print Archive

Crossref

ART

Compressed Subsequence Matching and Packed Tree Coloring

Author: A. Tiskin
D.D. Sleator
G. Das
H. Mannila
J. Ziv
M. Charikar
M. Crochemore
M. Thorup
M.A. Bender
M.L. Fredman
N.J. Larsson
O. Berkman
P. Cégielski
P. Ferragina
P.F. Dietz
R.A. Baeza-Yates
S. Abiteboul
S. Alstrup
T. Yamamoto
W. Rytter
Z. Troníček
Publication date: 01/01/2014

We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses $O(n+\frac{n\sigma}{w})$ space and $O(n+\frac{n\sigma}{w}+m\log N\log w\cdot occ)$ or $O(n+\frac{n\sigma}{w}\log w+m\log N\cdot occ)$ time. Here $w$ is the word size and $occ$ is the number of occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for $occ=o(\frac{n}{\log N})$ occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure in turn is based on a new data structure for the tree color problem, where the node colors are packed in bit strings.

Comment: To appear at CPM '1
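The "next occurrence of a given character after a given position" primitive named in this abstract is easiest to see on an uncompressed string. The sketch below assumes a plain lookup table over the raw text, which takes space proportional to the text length times the alphabet size; it is not the compressed data structure of the paper, and the function names are hypothetical.

```python
def build_next_table(text):
    """nxt[i] maps a character c to the smallest j >= i with text[j] == c.
    Characters that do not occur at or after i are absent from nxt[i]."""
    n = len(text)
    nxt = [dict() for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        nxt[i] = dict(nxt[i + 1])  # copy the table of the suffix starting at i + 1
        nxt[i][text[i]] = i
    return nxt


def leftmost_match_end(nxt, pattern, start=0):
    """Greedily match pattern as a subsequence starting at `start`.
    Returns the index just past the last matched character, or None."""
    pos = start
    for ch in pattern:
        j = nxt[pos].get(ch)
        if j is None:
            return None
        pos = j + 1
    return pos


nxt = build_next_table("abracadabra")
print(leftmost_match_end(nxt, "aca"))
print(leftmost_match_end(nxt, "cc"))
```

Each pattern character costs one table lookup, so a greedy match takes time linear in the pattern length once the table exists.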

arXiv.org e-Print Archive

CiteSeerX

Crossref

Online Research Database In Technology

Labeling Schemes for Bounded Degree Graphs

Author: A. Korman
C. Gavoille
C. Nash-Williams
F.R.K. Chung
L. Esperet
N. Bonichon
S. Alstrup
S. Bhatt
S. Butler
Y. Wang
Publication date: 01/01/2014

We investigate adjacency labeling schemes for graphs of bounded degree $\Delta = O(1)$. In particular, we present an optimal (up to an additive constant) $\log n + O(1)$ adjacency labeling scheme for bounded degree trees. The latter scheme is derived from a labeling scheme for bounded degree outerplanar graphs. Our results complement a similar bound recently obtained for bounded depth trees [Fraigniaud and Korman, SODA 10], and may provide new insights for closing the long standing gap for adjacency in trees [Alstrup and Rauhe, FOCS 02]. We also provide improved labeling schemes for bounded degree planar graphs. Finally, we use combinatorial number systems and present an improved adjacency labeling scheme for graphs of bounded degree $\Delta$ with $(e+1)\sqrt{n} < \Delta \leq n/5$.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Copenhagen University Research Information System

Tree Compression with Top Trees Revisited

Author: F Wang
G Busatto
JI Munro
M Charikar
M Hirakawa
M Lohrey
NJ Larsson
P Ferragina
PJ Downey
S Alstrup
S Gog
S Maneth
S Maruyama
Publication date: 01/01/2015

We revisit tree compression with top trees (Bille et al., ICALP'13) and present several improvements to the compressor and its analysis. By significantly reducing the amount of information stored and guiding the compression step using a RePair-inspired heuristic, we obtain a fast compressor achieving good compression ratios, addressing an open problem posed by Bille et al. We show how, with relatively small overhead, the compressed file can be converted into an in-memory representation that supports basic navigation operations in worst-case logarithmic time without decompression. We also show a much improved worst-case bound on the size of the output of top-tree compression (answering an open question posed in a talk on this algorithm by Weimann in 2012).

Comment: SEA 201

arXiv.org e-Print Archive

Crossref

KITopen

Leicester Research Archive

A simpler and more efficient algorithm for the next-to-shortest path problem

Author: Bang Ye Wu
I. Krasiko
K.-H. Kao
K.N. Lalgudi
M. Thorup
M.L. Fredman
M.R. Henzinger
S. Alstrup
S. Li
S. Mondal
S.C. Barman
T.H. Cormen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/05/2011

Given an undirected graph $G=(V,E)$ with positive edge lengths and two vertices $s$ and $t$, the next-to-shortest path problem is to find an $st$-path whose length is minimum amongst all $st$-paths strictly longer than the shortest path length. In this paper we show that the problem can be solved in linear time if the distances from $s$ and $t$ to all other vertices are given. In particular, our new algorithm runs in $O(|V|\log |V|+|E|)$ time for general graphs, which improves the previous result of $O(|V|^2)$ time for sparse graphs, and takes only linear time for unweighted graphs, planar graphs, and graphs with positive integer edge lengths.

Comment: Partial result appeared in COCOA201

arXiv.org e-Print Archive

Crossref

Dynamic and Multi-functional Labeling Schemes

Author: A Korman
C Gavoille
D Adjiashvili
D Peleg
E Cohen
M Lewenstein
N Rotbart
P Fraigniaud
S Alstrup
Publication date: 01/01/2014

We investigate labeling schemes supporting adjacency, ancestry, sibling, and connectivity queries in forests. In the course of more than 20 years, the existence of $\log n + O(\log \log n)$ labeling schemes supporting each of these functions was proven, with the most recent being ancestry [Fraigniaud and Korman, STOC '10]. Several multi-functional labeling schemes also enjoy lower or upper bounds of $\log n + \Omega(\log \log n)$ and $\log n + O(\log \log n)$, respectively. Notable examples are an upper bound of $\log n + 5\log \log n$ for adjacency+siblings and a lower bound of $\log n + \log \log n$ for each of the functions siblings, ancestry, and connectivity [Alstrup et al., SODA '03]. We improve the constants hidden in the $O$-notation. In particular we show a $\log n + 2\log \log n$ lower bound for connectivity+ancestry and connectivity+siblings, as well as an upper bound of $\log n + 3\log \log n + O(\log \log \log n)$ for connectivity+adjacency+siblings by altering existing methods. In the context of dynamic labeling schemes it is known that ancestry requires $\Omega(n)$ bits [Cohen et al., PODS '02]. In contrast, we show upper and lower bounds on the label size for adjacency, siblings, and connectivity of $2\log n$ bits, and $3 \log n$ to support all three functions. There exist efficient adjacency labeling schemes for planar, bounded treewidth, bounded arboricity and interval graphs. In a dynamic setting, we show a lower bound of $\Omega(n)$ for each of those families.

Comment: 17 pages, 5 figures

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System